Abstract

We use the correlation matrix of stock returns in order to create maps of the São Paulo Stock Exchange
(BM&F-Bovespa), Brazil's main stock exchange. The data refer to the year 2010, and the correlations between
stock returns lead to the construction of a minimum spanning tree and of asset graphs with a variety of
threshold values. The results are analyzed using techniques of network theory.

1 Introduction 



The BM&F-Bovespa (Bolsa de Valores, Mercadorias e Futuros de São Paulo) is the major stock exchange in
Brazil, with a market capitalization of US$ 927 million in 2010. It is also a good representative of a stock
exchange of an emerging market, and has received increasing attention from international investors in the past
years.

The aim of this article is to present a map, actually two maps, of the BM&F-Bovespa in 2010. In order to
do so, I shall use some techniques taken from Random Matrix Theory [1], first developed for use in nuclear
physics and then used in many areas, including finance (see [2] for a comprehensive list of contributions). There
are many studies of networks built from data of financial markets around the world, mainly based on the New
York Stock Exchange [3]-[18], but also using data from Nasdaq [18], the London Stock Exchange [19], [20], the
Tokyo Stock Exchange [19], [21], the Hong Kong Stock Exchange [19], the National Stock Exchange of India [22],
the global financial market [11], [19], [23]-[26], the USA commodity market [27], the foreign currency market
[28]-[31], and the world trade market [32]-[35]. To the author's knowledge, no work to date has
used the stocks of BM&F-Bovespa as a source of data for developing networks.

Both maps are built using the time series of the 190 stocks of BM&F-Bovespa that were traded every
day the stock exchange was open (a list is given in Appendix A), so that those stocks are all very liquid. From
the time series of daily prices, I obtained the series of log-returns, given by



$$S_t = \ln(P_t) - \ln(P_{t-1}) \approx \frac{P_t - P_{t-1}}{P_{t-1}} , \qquad (1)$$

where $P_t$ is the price of a stock at day $t$ and $P_{t-1}$ is the price of the same stock at day $t-1$. The correlation
matrix between all log-returns was then calculated using the data obtained for the whole year of 2010. 
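As a minimal sketch of equation (1), and not the code used to produce the results of this article, the log-returns of a price series can be computed as follows. The prices are hypothetical and serve only to show that the log-return stays close to the relative price change when daily variations are small.

```python
import math

def log_returns(prices):
    """Log-returns S_t = ln(P_t) - ln(P_{t-1}) of a list of daily prices."""
    return [math.log(today) - math.log(yesterday)
            for yesterday, today in zip(prices, prices[1:])]

def simple_returns(prices):
    """Relative price changes (P_t - P_{t-1}) / P_{t-1}."""
    return [(today - yesterday) / yesterday
            for yesterday, today in zip(prices, prices[1:])]

# Hypothetical closing prices, for illustration only.
prices = [10.00, 10.50, 10.29, 10.80]
s = log_returns(prices)
r = simple_returns(prices)
```

One convenient property of log-returns is that they add up over time: the sum of the daily values equals the log-return over the whole period.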

The time series of stock prices encode an enormous amount of information about the way they relate to
each other, and only part of that information is captured by the correlation matrix of their log-returns. This
information also contains a good amount of noise, which should be filtered whenever possible. In this
work, I employ threshold values based on simulations of randomized versions of the time series
under study in order to eliminate some of that noise. The shuffled data is obtained by randomly reordering
the time series of each log-return. This generates time series that have exactly the same probability
distribution as the original data, but with the connections between the log-returns made completely random.
The correlation matrix obtained from the randomized data is then compared with the correlation matrix of the
original data. All connection values that are comparable to the ones of randomized data are then eliminated
or marked as possibly random.
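The shuffling procedure can be sketched as below. This is an illustration under stated assumptions rather than the actual simulation: the two series are synthetic, and the Pearson coefficient is used only to keep the example short. Each series is permuted independently, so its set of values (and hence its distribution) is untouched while the day-by-day pairing with the other series is destroyed.

```python
import random

def shuffle_series(series_list, rng):
    """Independently permute each series; the values of each series are preserved."""
    shuffled = []
    for series in series_list:
        copy = list(series)
        rng.shuffle(copy)
        shuffled.append(copy)
    return shuffled

def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)                       # fixed seed, for reproducibility
x = [rng.gauss(0, 1) for _ in range(250)]    # synthetic "log-returns"
y = [v + 0.1 * rng.gauss(0, 1) for v in x]   # a series strongly tied to x
original_corr = pearson(x, y)
sx, sy = shuffle_series([x, y], rng)
shuffled_corr = pearson(sx, sy)
```

Repeating the shuffle many times gives a spread of correlations that arise by chance alone, against which the correlations of the original data can be compared.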



There are many measures of correlation between elements of time series, the most popular being the Pearson
correlation coefficient. The drawback of this correlation measure is that it only detects linear relationships
between two variables. Two different measures that can detect nonlinear relations are the Spearman and the
Kendall tau rank correlations, which measure the extent to which the variation of one variable affects the other
variable, without that relation being necessarily linear. In this work, I chose Spearman's rank correlation, for
it is fairly fast to calculate and it is better at measuring nonlinear relations.
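The difference between the two coefficients can be seen in a small example. The sketch below is illustrative only: it assumes data without ties (ties would require average ranks) and uses an exponential relation chosen by hand. Spearman's coefficient is the Pearson coefficient applied to the ranks of the data, so any strictly increasing relation gives the value 1.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def ranks(values):
    """Ranks 1..n of the values; assumes there are no ties (a simplification)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

x = [0.5 * i for i in range(1, 11)]
y = [math.exp(v) for v in x]        # monotonic but strongly nonlinear in x
linear_measure = pearson(x, y)
rank_measure = spearman(x, y)
```

Here the rank correlation is exactly 1, while the Pearson coefficient is noticeably smaller because the relation is not a straight line.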

The correlation matrix may then be used to create a distance matrix, where distance is a measure of how
uncorrelated two log-return series are from one another. There are also many ways to build a distance measure
from the correlations between data. In this work, I shall use a different metric from [3], which is a nonlinear
mapping of the Pearson correlation coefficients between stock returns. The metric to be considered here differs
from the aforementioned metric because it is a linear function of the Spearman rank correlation coefficient
between the stocks that are being studied:

$$d_{ij} = 1 - C_{ij} , \qquad (2)$$

where $C_{ij}$ are elements of the correlation matrix calculated using Spearman's rank correlation. This distance
goes from the minimum value 0 (correlation 1) to the maximum value 2 (correlation -1).
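Equation (2) translates directly into code. The following sketch uses a small, made-up correlation matrix; the numbers are assumptions chosen only to show how correlations map to distances.

```python
def distance_matrix(corr):
    """Apply d_ij = 1 - C_ij to every element of a correlation matrix."""
    return [[1.0 - c for c in row] for row in corr]

# Hypothetical rank correlations between three stocks.
corr = [
    [1.00, 0.90, 0.31],
    [0.90, 1.00, 0.50],
    [0.31, 0.50, 1.00],
]
dist = distance_matrix(corr)
```

A stock is at distance 0 from itself, strongly correlated stocks are close (0.1 in this example), and weakly correlated ones are far apart (0.69).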

Using the distance matrix, I shall build two maps (networks) based on the stocks of BM&F-Bovespa. The 
first map shall be based on a Minimum Spanning Tree (MST), which is a graph where each stock is a node 
(vertex), connected to one or more nodes. An MST is a planar graph with no intersections in which all nodes are 
connected and the sum of distances is minimum. The number of connections is the same as the total number of 
nodes of the network, minus one. This type of representation is very useful for visualizing connections between 
stocks, but it often oversimplifies the original data. Another representation is called an asset graph, which may
be built by establishing a distance threshold below which connections are kept, eliminating all connections
above the said threshold. This also makes the number of original connections drop, simplifying the information
given by the original correlation matrix. A three-dimensional map of stocks may be built in such a way that
the distances portrayed in it are the best approximation to the real distances, and the connections obtained by
establishing a threshold are then drawn on such representation.

The maps are examples of two networks that can be obtained from the same original data, each with its
advantages and drawbacks. Using the two maps, one may then be able to identify which stocks are more
dependent on each other, and also which ones are more connected to others, which may be useful when building
portfolios that minimize risk through diversification [36]. There are measures of how central each node of
a network is, establishing its overall importance in the web of nodes, and also ways to visualize the overall
distribution of those measures. As we shall see, the maps help verify that stocks that belong to the same types
of companies tend to agglomerate in the same clusters, and that stocks with weak correlations often tend to
connect at random in those networks.

The MST is built in section 2, its centrality measures are shown in section 3, and their cumulative
distribution functions are studied in section 4. The asset graphs are built in section 5, and the centrality
measures of one of them are studied in section 6. Section 7 discusses the k-shell decomposition of one of the
asset graphs, and section 8 presents a conclusion and general discussion of results.

2 Minimum spanning trees 

Minimum spanning trees are networks in which all nodes are connected by at least one edge, the sum
of the edge distances is minimum, and there are no loops. This kind of tree is particularly useful for representing
complex networks, filtering the information about the correlations between all nodes and presenting it in a
planar graph.
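One common way to obtain such a tree is Kruskal's algorithm, sketched below under the assumption that the input is a symmetric distance matrix. This is an illustration, not necessarily the procedure used to draw the figures of this section. Edges are visited from the shortest to the longest, and an edge is kept only when it joins two groups of nodes that were not yet connected, which guarantees that no loop is formed.

```python
def minimum_spanning_tree(dist):
    """Kruskal's algorithm on a symmetric distance matrix.

    Returns a list of (i, j, distance) edges with i < j.
    """
    n = len(dist)
    parent = list(range(n))          # union-find forest: one component per node

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    tree = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:         # joining two components cannot create a loop
            parent[root_i] = root_j
            tree.append((i, j, d))
            if len(tree) == n - 1:   # a spanning tree has n - 1 edges
                break
    return tree

# Hypothetical distances between four stocks.
dist = [
    [0.0, 0.1, 0.6, 0.9],
    [0.1, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
]
mst = minimum_spanning_tree(dist)
```

For the four nodes of the example the tree has three edges, in agreement with the rule that the number of connections equals the number of nodes minus one.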

As discussed in the introduction, I shall employ simulations of randomized data in order to establish 
a distance threshold above which correlations are seen as possibly of random nature. The result of 1000 
simulations of randomized data is a lower threshold 0.69 ± 0.02, so distances above this value are represented 
as dashed lines in the minimum spanning tree diagrams. 

The network formed by the stocks of the BM&F-Bovespa which were traded every day the stock exchange
operated during 2010 has six main hubs, which are displayed in figure 1. The stocks are VALE3 and VALE5,
belonging to Vale, the major mining company in Brazil and one of the largest in the world; BBDC4, stock of
Bradesco, one of the major banks in Brazil; BRAP4, stock of a branch of Bank Bradesco responsible for its
participations in other companies; and GFSA3, stock of Gafisa, and PDGR3, stock of PDG Realty, both
construction and materials companies.



Figure 1: main axis of the minimum spanning tree for 2010 (nodes shown: BRAP4, VALE3, VALE5, BBDC4, GFSA3, and PDGR3).

For reasons of clarity, I shall divide the minimum spanning tree into five diagrams, or clusters, each one
constructed around one of the five main hubs. The first cluster we shall study in more detail, figure 2, is the one
formed around BRAP4, Bradespar, which is a company created when Bradesco (banking) was split up.
It manages the participations of Bradesco in other, non-financial companies, particularly CPFL and Vale. As
can be seen from the excerpt of the minimum spanning tree, it has strong ties with VALE3 (of Vale, a mining
company of which it holds about 17% of the stocks). It is immediately surrounded by 13 stocks: three of them
of financial background (PINE4 and ABCB4 are both stocks of banks, PSSA3, of an insurance company), and others from
health (TEMP3 and AMIL3), sanitation (SBSP3, itself linked with stocks of another sanitation company,
CSMG3), mining (VALE3 and MMXM3), petrochemistry (GPCP3), logistics (ALLL3, LLXL2, and OHLB3),
and agribusiness (SLCE3). Indirectly and weakly connected (as shown by the dashed line) to this hub is the
stock of another insurance company (SULA11). This is a somewhat mixed cluster, with a variety of stocks
orbiting the stock of a heavily capitalized bank (Bradesco is the third major bank of Brazil).

Figure 2: cluster around BRAP4.

The second and third clusters are actually the same, and they are represented in different figures in order to
make it easier to discern the organization of the stocks around VALE3 and VALE5. Beginning with figure 3,
one can see a cluster formed around VALE3, which is strongly correlated with VALE5, as would be expected.
Vale is a mining company, and one can observe that connected to it, at the top, there is a collection of other
stocks of mining companies (CSNA3, USIM3, and USIM5), of metallurgy (GGBR3, GGBR4, and GOAU4, all
of them from the Gerdau group, and MAGG3), and of petrochemistry (UNIP6). Also connected, indirectly, to
VALE3, are the stocks PETR3 and PETR4, of Petrobras, an oil, gas, and biofuel company which is one of the
largest in the world. Petrobras is responsible for a large amount of the volume traded in the Bovespa, and it is
surprising not to find it as a hub itself in the minimum spanning tree representation.
A likely explanation is that the main product of Petrobras, petroleum, is a commodity traded worldwide and
so much more dependent on factors like the price of the oil barrel than on internal ones. This is enough to
set Petrobras' stocks apart from the others.

Figure 3: cluster around VALE3.

At the lower part of figure 3, one can see a cluster of stocks related with electricity distribution (ELET3 and
ELET6, ELPL4, CPFE3, ENBR3, LIGT3, CPLE6, and CMIG3 and CMIG4), most of them sparsely correlated with
one another. To the left, there is a cluster of stocks related with telecommunications (TNLP3 and TNLP4,
TMAR5, and BRTO3 and BRTO4). There are also some more stocks, related with engineering and materials,
logistics, food, heavy machinery, consumer goods, and health, scattered and not forming any particular cluster.
Note that most of these stocks, arranged apparently at random, have weak connections, as shown by the dashed
lines.

Figure 4 shows the cluster around VALE5, which is the same cluster as the one of figure 3, since VALE3
and VALE5 are intimately connected. At the right of VALE5, one can see a cluster of stocks belonging to
companies related to agriculture (FFTL4), food (BRFS3, MRFG3, and RNAR3), paper (the sequence made
by FIBR3, SUZB5, and KLBN4), and sugar and ethanol (SMTO3). There are also stocks belonging to other
companies, but they apparently do not form clusters.

Figure 4: cluster around VALE5.



The cluster in figure 5 is built around BBDC4 (closely connected with BBDC3), both stocks of Bank
Bradesco. It is a denser cluster, comprised in its center of stocks related with banks (ITUB3, ITUB4, and
ITSA4, all related with Bank Itau, the second largest in Brazil, SANB3, SANB4, and SANB11, stocks from
Bank Santander, BBAS3, stock of the Bank of Brazil, the largest in Brazil, and BRSR6) or with investment
and finance companies (GPIV11, RDCD3), and BVMF3, which is the stock of BM&F-Bovespa itself. To the left,
there are two stocks of COSAN (CSAN3 and CZLT11), a food, sugar, and ethanol company. To the top and
right, there is another cluster, of cyclic consumer goods: beverages (AMBV3 and AMBV4), tobacco (CRUZ3),
pharmaceuticals (DROG3), and sandals (ALPA4). Also to be noticed are the sequences RDCD3-CIEL3, both
related with companies that operate and sell credit and debit card terminals, and GOLL4-TAMM4, both related
with air transport companies. Another small cluster comprises the stocks TCSL3 and TCSL4, and VIVO4,
related with mobile telephony companies. Other stocks scatter around the main network without forming any
discernible subnetwork.

Figure 5: cluster around BBDC4.



The next cluster, represented in figure 6, is centered around a construction and real estate company, Gafisa
(GFSA3), and is composed mostly of the stocks of other construction and real estate companies (PDGR3,
CYRE3, CRDE3, EVEN3, INPR3, JFEN3, and LPSB3), and of stocks of consumer goods companies (LAME3
and LAME4, LREN3, BTOW3, NATU3, AMAR3, HGTX3, and MTIG4). There is also a small cluster (lower
part of the figure) of stocks of electricity distribution companies (GETI3 and GETI4, and TBLE3).

Figure 6: cluster around GFSA3.



The last cluster, figure 7, is also centered around the stock of a construction and real estate company,
PDG Realty (PDGR3), and is composed mostly of stocks of other construction and real estate companies
(GFSA3, EZTC3, RSID3, MRVE3, JHSF3, and CCIM3), real estate management of shopping centers (BRML3,
MULT3, and IGTA3), and of building materials (DTEX3). Other stocks belong to electricity, logistics and
transportation, consumer goods, and other types of companies.

Figure 7: cluster around PDGR3.



3 Centrality measures 

In network theory, the centrality of a node, or how influential the vertex is in the network, is an important
measurement which is handled in a number of different ways. In what follows, I analyze the
centrality of vertices in the network depicted by the MST of the last section according to five different definitions
of centrality. I also analyze the frequency distribution of each centrality measure in the network
and identify the stocks that are more central according to each definition.

Node degree 

Next, I analyze the node degree of the stocks in the network. The node degree of a node (a stock in our case)
is the number of connections it has in the network. Most of the stocks have low node degree, and a few of
them have a large one. The latter are called hubs, and are generally the nodes that are
more important in events that can change the network. Table 1 and figure 8 show the node degree frequency
distribution of the network of the BM&F-Bovespa in 2010.

Table 1: node degree distribution. Figure 8: node degree frequency distribution.

The stocks with highest node degree are BRAP4 (node degree 13), BBDC4 (node degree 12), VALE5 (node 
degree 10), and VALE3 (node degree 8). 

Node strength 

The strength of a node is the sum of the correlations of the node with all other nodes to which it is connected. 
If C is the matrix that stores the correlations between nodes that are linked in the minimum spanning tree, 
then the node strength is given by

$$N_s^k = \sum_{i} C_{ik} + \sum_{j} C_{kj} , \qquad (3)$$

where $C_{ij}$ is an element of matrix $C$.
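Node degree and node strength can both be read off the list of edges of the tree. The sketch below assumes that each linked pair is stored once together with its correlation, so that adding the correlation to both end nodes reproduces the two sums of equation (3). The node names and correlation values are hypothetical.

```python
from collections import defaultdict

def degree_and_strength(edges):
    """edges: (node_a, node_b, correlation) tuples, each linked pair listed once."""
    degree = defaultdict(int)
    strength = defaultdict(float)
    for a, b, corr in edges:
        degree[a] += 1
        degree[b] += 1
        strength[a] += corr      # the link contributes to the strength of both ends
        strength[b] += corr
    return dict(degree), dict(strength)

# A hypothetical tree: one hub with three neighbours, plus one more distant node.
edges = [("HUB", "A", 0.8), ("HUB", "B", 0.6), ("HUB", "C", 0.5), ("C", "D", 0.4)]
degree, strength = degree_and_strength(edges)
```

In this example the hub has degree 3 and strength 1.9, while the end node D has degree 1 and strength 0.4.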



In our network, the vertices with highest node strength are BBDC4 ($N_s = 6.98$), BRAP4 ($N_s = 5.37$),
VALE5 ($N_s = 4.72$), and VALE3 ($N_s = 4.50$). The node strength frequency distribution is shown in figure 9.

Figure 9: node strength frequency distribution.



Eigenvector centrality 

Node degree may be seen as a poor representation of how important a node is, for it doesn't take into account
how important the neighbors of a node may be. As an example, a node may have a low degree, but it may
be connected with other nodes of very high degree, so it is, in some way, influential. A measure that takes
into account the degree of neighboring nodes when calculating the importance of a node is called eigenvector
centrality. In order to define it properly, one must first define an adjacency matrix, $A$, whose elements $a_{ij}$ are 1
if there is a connection between nodes $i$ and $j$ and zero otherwise. Considering the eigenvalues of the
adjacency matrix and choosing the largest one, $\lambda$, one may then define the corresponding eigenvector by
the equation

$$A X = \lambda X , \qquad (4)$$

where $X$ is the eigenvector with the largest eigenvalue $\lambda$. The eigenvector centrality of a node $i$ is then defined
as the $i$th element of eigenvector $X$:



$$E_c^i = X_i , \qquad (5)$$

where $X_i$ is the element of $X$ in row $i$.
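The leading eigenvector of equation (4) can be approximated by power iteration, as in the sketch below. Two choices here are assumptions of the example and not statements about how the values in this article were obtained: the iteration is applied to $A + I$ instead of $A$ (both have the same eigenvectors, and the shift avoids the oscillation that plain iteration shows on trees), and the result is normalized to unit Euclidean length.

```python
import math

def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I, where adj is a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    x = [1.0] * n
    for _ in range(iterations):
        # y = (A + I) x; the identity shift keeps the iteration from oscillating
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x

# A star with node 0 at the center and three leaves.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
centrality = eigenvector_centrality(star)
```

For this star the exact answer is known: the center has centrality $1/\sqrt{2}$ and each leaf $1/\sqrt{6}$, so the center is the most central node even though every leaf is directly attached to it.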

In the present network, the vertices with highest eigenvector centralities are BBDC4 ($E_c = 0.502$), VALE5
($E_c = 0.373$), VALE3 ($E_c = 0.294$), and BRAP4 ($E_c = 0.257$). The eigenvector centrality frequency distribution
is shown in figure 10.




Figure 10: eigenvector centrality frequency distribution. 

Betweenness

The betweenness centrality measures how much a node lies on the paths between other vertices. It is an
important measure of how much a node acts as an intermediary between other nodes. It may be defined
as

$$B_c^k = \sum_{i,j} \frac{m_{ij}^k}{m_{ij}} , \qquad (6)$$

where $m_{ij}^k$ is the number of shortest paths (geodesic paths) between nodes $i$ and $j$ that pass through node $k$
and $m_{ij}$ is the total number of shortest paths between nodes $i$ and $j$. Our network is connected, so we
need not worry about $m_{ij}$ being zero. The betweenness centrality frequency distribution is shown in figure 11.

Figure 11: betweenness centrality frequency distribution.

The vertices with highest betweenness centrality are BBDC4 ($B_c = 12010$), VALE5 ($B_c = 10490$), VALE3
($B_c = 9197$), and GFSA3 ($B_c = 9010$). There are 106 vertices with zero betweenness centrality, which means
that no shortest path between any two vertices in the network passes through those vertices.
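Betweenness can be computed with Brandes' algorithm, sketched below for an unweighted, undirected graph. This is an illustration with assumptions of its own: paths are measured in number of links, each unordered pair of nodes is counted once, and the small tree used as input is hypothetical. In a tree there is only one path between any two nodes, so the end nodes (leaves) necessarily have zero betweenness.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm; adj maps each node to the list of its neighbours."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}       # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}        # number of shortest paths from s
        sigma[s] = 1
        dist = {v: -1 for v in adj}
        dist[s] = 0
        queue = deque([s])
        while queue:                       # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                       # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}   # each unordered pair counted once

# Hypothetical tree: hub H linked to A, B, and C; D hangs from C.
tree = {"H": ["A", "B", "C"], "A": ["H"], "B": ["H"], "C": ["H", "D"], "D": ["C"]}
bc = betweenness(tree)
```

In the example, five pairs of nodes communicate through H and three through C, while the leaves A, B, and D lie on no path between other nodes.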

Closeness 

Another measure of centrality is the closeness centrality, which measures the average distance between one
node and all the others. It is defined in terms of the mean geodesic distance for a given node $i$, which is
given by

$$\ell_i = \frac{1}{n} \sum_{j} d_{ij} , \qquad (7)$$

where $n$ is the number of vertices and $d_{ij}$ is a geodesic (minimum path) distance from node $i$ to node $j$. This
measure is small for highly connected vertices and large for distant or poorly connected ones. In order to obtain
a measure that is large for highly connected nodes and small for poorly connected ones, one then defines the
inverse closeness centrality of node $i$ as

$$C_c^i = \frac{1}{\ell_i} . \qquad (8)$$



The vertices with highest inverse closeness centrality are BBDC4 ($C_c = 0.00107$), VALE5 ($C_c = 0.00106$),
GFSA3 ($C_c = 0.00100$), and VALE3 ($C_c = 0.00099$). The inverse closeness centrality frequency distribution is
shown in figure 12.

Figure 12: inverse closeness centrality frequency distribution.
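
The following sketch illustrates the idea of inverse closeness on a small, hypothetical tree. It rests on assumptions that may differ from the calculation behind the values quoted above: distances are counted in number of links, the graph is connected, and the mean is taken over all $n$ nodes. The absolute values therefore should not be compared with those of the text; only the ordering of the nodes is of interest.

```python
from collections import deque

def hop_distances(adj, source):
    """Number of links on the shortest path from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def inverse_closeness(adj):
    """1 / (mean geodesic distance) for every node of a connected graph."""
    n = len(adj)
    result = {}
    for v in adj:
        mean_distance = sum(hop_distances(adj, v).values()) / n
        result[v] = 1.0 / mean_distance
    return result

# Hypothetical tree: hub H linked to A, B, and C; D hangs from C.
tree = {"H": ["A", "B", "C"], "A": ["H"], "B": ["H"], "C": ["H", "D"], "D": ["C"]}
cc = inverse_closeness(tree)
```

The hub H, which is at most two links away from any other node, obtains the highest value, and the peripheral node D the lowest.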

4 Cumulative distribution function

Note that all frequency distributions, with the exception of the one for inverse closeness centrality (the same
happens for the frequency distribution of the closeness centrality), are rapidly decreasing. One says that
such frequency distributions follow a power law of the type

$$p_k = c k^{-\alpha} , \qquad (9)$$

where $p_k$ is the frequency distribution for the value $k$, and $c$ and $\alpha$ are constants. This is a characteristic of
a diversity of complex systems, and it happens in the study of earthquakes, the world wide web, networks
of scientific citations, of film actors, of social interactions, protein interactions, and many other topics [37].
Networks whose centrality measures follow this type of distribution are often called scale-free networks, and
this behavior can best be visualized if one plots a graph of the cumulative frequency distribution of a centrality
in terms of the centrality values, both in logarithmic representation. Figure 13 shows the logarithm of the
cumulative frequency distributions as functions of the logarithm of the five types of centrality measures we
used in the last section.
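
A minimal sketch of this kind of plot, in numbers rather than in a picture, is given below. The degree values are made up so that the result is exact, and the straight-line fit is an ordinary least-squares fit in log-log coordinates, which is an assumption of the example. Note that when $p_k$ follows equation (9), the cumulative distribution also follows a power law, with exponent approximately $\alpha - 1$.

```python
import math

def cumulative_distribution(values):
    """For each distinct value k, the fraction of nodes with value >= k."""
    n = len(values)
    ks = sorted(set(values))
    return ks, [sum(1 for v in values if v >= k) / n for k in ks]

def loglog_slope(ks, ps):
    """Least-squares slope of log(p) against log(k); requires k > 0 and p > 0."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(p) for p in ps]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Hypothetical node degrees: many poorly connected nodes and a few hubs.
degrees = [1] * 8 + [2] * 4 + [4] * 2 + [8] * 2
ks, ps = cumulative_distribution(degrees)
slope = loglog_slope(ks, ps)
```

For these values the cumulative distribution halves every time the degree doubles, so the points fall on a straight line of slope -1 in the log-log plot.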

Figure 13: log-log plots of the cumulative frequency distributions as functions of centrality measures. Panels
include a) node degree x cumulative distribution; b) node strength x cumulative distribution; e) inverse
closeness x cumulative distribution.



Were a network a pure scale-free one, then the plot of the logarithm of the cumulative frequency
distribution as a function of the logarithm of a centrality measure would be a straight line. One may notice
that this behavior is approximately followed by the intermediate levels of all centrality measures, except for
inverse closeness centrality. The very same behavior is seen, for example, in the world wide web or in networks
of citations.

5 Asset Graphs

A two dimensional representation of non-overlapping connections between vertices, such as the minimum
spanning tree, has some misleading features. Two nodes that are actually close to each other may appear far apart or
unconnected, and one node that is just very weakly connected may establish a connection in the MST at
random. A three dimensional representation is often better, and one can be built using distance
as a guide. As an example, one may use an algorithm that minimizes the sum of the squares of the differences
between the real distances and the ones portrayed in a three dimensional graph. Other, more advanced
techniques can also be used, such as principal coordinates analysis. By using the latter, one obtains the following
three dimensional representation of the stocks of BM&F-Bovespa (figure 14).

Figure 14: three dimensional representation of stocks of the BM&F-Bovespa.

The picture is not very enlightening when printed on paper, but offers a good three-dimensional view when
viewed on a computer. In particular, it may be used in order to study the distributions of stocks of the same
type of companies, which is done in Appendix B. Here, we shall use the three dimensional representation in
order to study asset graphs, which are networks built on the distance matrix by establishing thresholds under
which connections are considered. As an example, one may build a network of those nodes whose distances
to the others are below or equal to T = 0.5. This will probably exclude many of the connections and some
of the nodes of the original network. In what follows, I perform an analysis of some asset graphs obtained by
selecting thresholds that go from 0.1 to 0.7, which is the limit at which random noise starts to dominate.
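
The construction of an asset graph amounts to a simple filter on the distance matrix, as in the sketch below. The labels and distances are hypothetical; only the rule "keep a connection when its distance is below or equal to T" is taken from the text.

```python
def asset_graph(dist, labels, threshold):
    """Edges (label_i, label_j, distance) whose distance is <= threshold."""
    edges = []
    for i in range(len(dist)):
        for j in range(i + 1, len(dist)):
            if dist[i][j] <= threshold:
                edges.append((labels[i], labels[j], dist[i][j]))
    return edges

# Hypothetical distances between four stocks.
labels = ["A", "B", "C", "D"]
dist = [
    [0.0, 0.1, 0.6, 0.9],
    [0.1, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
]
graphs = {t: asset_graph(dist, labels, t) for t in (0.1, 0.5, 0.7)}
```

Raising the threshold can only add connections: every edge present at a lower value of T is still present at a higher one, which is why the clusters described next grow and merge as T increases.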

At T = 0.1 (figure 15), the only connections are those between BBDC3 and BBDC4, both stocks from
Bradesco (banking), GGBR3, GGBR4, and GOAU4 from Gerdau (metallurgy), ITUB4 and ITSA4 from Itau
(banking), PETR3 and PETR4 from Petrobras (petroleum and gas), and between VALE3 and VALE5, both
stocks from Vale (mining).

Figure 15: connections below threshold T = 0.1.

At T = 0.2 (figure 16), the earlier connections are joined by a small cluster of stocks RPMG3 and RPMG4,
of the Refinaria de Petroleo Manguinhos (petroleum refinement), and by another small cluster formed by
USIM3 and USIM4, of Usiminas (mining). Connections are established between BRAP4, Bradespar, which is
an investment branch of Bank Bradesco which has 17% of the control of Vale, and VALE3 and VALE5. A
cluster is formed with stocks from Bank Bradesco and Bank Itau, now joined by ITUB3.

Figure 16: connections below threshold T = 0.2.

At T = 0.3 (figure 17), the network is joined by the pairs TELB3-TELB4 of Telebras (telecommunications),
INEP3-INEP4 of Inepar (construction), TNLP3-TNLP4 of Telemar (telecommunications), CMIG3-CMIG4 of
CEMIG (electricity), and AMBV3-AMBV4 of Ambev (beverages). The network formed by the stocks of Gerdau
now joins with the stocks of Usiminas, and with the newcomer CSNA3 of Companhia Siderúrgica Nacional
(metallurgy). The new network represents a cluster of stocks of companies working in mining and metallurgy,
close to but still not connected with Vale. A new network is formed by the stocks GFSA3 of Gafisa, CYRE3 of
Cyrela, RSID3 of Rossi Residencial, PDGR3 of PDG Realty, and MRVE3 of MRV Engenharia e Participações,
all of them in the construction business.

Figure 17: connections below threshold T = 0.3.

For T = 0.4 (figure 18, with the new stocks in the network in red, for better visualization), the mining
and metallurgy network joins with the financial network, and the construction network becomes denser, with
more connections established between its nodes. We also have the newcomer isolated pairs TCSL3-TCSL4
of TIM (telecommunications), CIEL3 of Cielo and RDCD3 of Redecard, both operating in the business of
electronic cards, SUZB5 of Suzano Papel e Celulose and FIBR3 of Fibria Celulose, both operating in the paper
production market, and GOLL4 of Gol and TAMM4 of TAM, both airlines. The stocks of two
banks, SANB11 of Bank Santander and BBAS3 of the Bank of Brazil, connect with the financial network via
the stocks of Bradesco, and the stock CZLT11 of Cosan (sugar and ethanol) connects with CSNA3 (metallurgy).

Figure 18: connections below threshold T = 0.4.

For T = 0.5 (figure 19, with the new stocks in the network highlighted in red), all three major networks,
mining and metallurgy, financial, and construction, are now joined, with a central role played by the
financial network. BISA3 of Brookfield Incorporações, EVEN3 of Even Construtora e Incorporadora, and EZTC3 of
EZTEC, all of the building industry, join the building companies network. BRML3 of BR Malls and MULT3
of Multiplan, both of the real estate business (shopping centers), form a pair close to but not connected with
the building industry network.

Figure 19: connections below threshold T = 0.5.

BRTO4 of Brasil Telecom and TMAR5 of Telemar Norte Leste, both of telecommunications companies,
join TNLP4 of Telemar in order to form a telecommunications network. GETI3 and GETI4, both of AES Tiete
(electricity), form an isolated pair. LAME3 and LAME4 of Lojas Americanas and LREN3 of Lojas Renner,
both of the retail business, form another separate cluster. MMXM3 of MMX Mineração e Metálicos joins
Vale and Bradespar in the mining and metallurgy network. The pair of stocks of Inepar (construction) connects
with Vale (mining), the pair of Telemar connects with Telemar Norte Leste, and the stocks of Gol (airline) connect
with the ones of Bradesco (banking) and Gerdau (metallurgy). BVMF3, the stock of the BM&F-Bovespa
(financial), connects with the ones of Bradesco, Bradespar, Itau (all in the banking and financial sectors), Cyrela
(building), and Gerdau (metallurgy). Petrobras (petroleum and gas) connects with Bradesco (banking) and
Gerdau (metallurgy).

For T = 0.6 (figure 20, with the new stocks in the network highlighted in red), and T = 0.7, connections
between sectors become more frequent, and already existing clusters become denser. More stocks take part in
the complete network now, even those that have weaker correlations with other stocks. From T = 0.8 onwards,
noise starts to take over, and new connections are not reliable.

Figure 20: connections below threshold T = 0.6.



6 Centrality measures for the asset graph 

I shall now analyze some of the centrality measures seen in sections 3 and 4, now applying them to the
network obtained by fixing a threshold T = 0.7 (just below the noisy region) and considering only those
distances that are below it. I begin by analyzing the node degree. Table 2 and figure 21 show the node degree
distribution.

Table 2: node degree distribution for the asset graph.

Figure 21: node degree frequency distribution.

The highest node degrees belong to BBDC4 and ITUB4 (node degree 103), BRAP4 (node degree 101),
ITSA4 (node degree 95), BBDC3, and VALE5 (node degree 95). Figure 22 represents the cumulative frequency
distribution of the node degree plotted against the node degree.

Figure 22: node degree x cumulative distribution.

Next, calculating the node strength, one can see that the nodes with highest values are BBDC4 ($N_s = 86.87$),
ITUB4 ($N_s = 85.05$), BRAP4 ($N_s = 81.22$), and ITSA4 ($N_s = 79.93$). The probability distribution function
is shown in figure 23, together with the log-log plot of the cumulative distribution as a function of the
node strength (figure 24).

Figure 23: node strength frequency distribution.

Figure 24: node strength x cumulative distribution.



The eigenvector centrality, given by equation (5), measures the centrality of a node by giving weights to the
nodes that are immediately connected with it. The highest values for eigenvector centrality are those of BBDC4
and ITUB4 ($E_c = 0.172$), ITSA4 ($E_c = 0.167$), BBDC3 and BRAP4 ($E_c = 0.163$), and VALE5 ($E_c = 0.160$).
The probability distribution function is shown in figure 25, together with the log-log plot of the cumulative
distribution as a function of the eigenvector centrality (figure 26).

Figure 25: eigenvector centrality frequency distribution.

Figure 26: eigenvector centrality x cumulative distribution.



If we now consider betweenness as a measure of centrality, then the highest values occur for BRAP4 
($B_c = 950$), BBDC4 ($B_c = 722$), ITUB4 ($B_c = 701$), and VALE5 ($B_c = 629$). The probability distribution 
function is shown in figure 27, together with the log-log plot of the cumulative distribution as a function of the 
betweenness centrality (figure 28). 

Figure 27: betweenness centrality frequency distribution. 

The last centrality measure I shall analyze is inverse closeness centrality, for which the highest values occur for 
BBDC4 and ITUB4 ($C_c = 1.866 \times 10^{-4}$), BRAP4 and ITSA4 ($C_c = 1.864 \times 10^{-4}$), BBDC3 
($C_c = 1.863 \times 10^{-4}$), and VALE5 ($C_c = 1.862 \times 10^{-4}$). The probability distribution function is 
shown in figure 29, together with the log-log plot of the cumulative distribution as a function of the inverse 
closeness centrality (figure 30). 

Figure 29: inverse closeness centrality frequency distribution. 



Figure 30: inverse closeness centrality x cumulative distribution 
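The two path-based measures above can be sketched for an unweighted graph as follows. Betweenness is computed with Brandes' algorithm and left unnormalized (a count of shortest paths passing through each node). For closeness, the sketch assumes the reciprocal of the sum of shortest-path lengths to the reachable nodes; this is only one possible convention, it is an assumption of this example, and its values are not meant to reproduce the ones reported here.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness of every node (Brandes' algorithm)."""
    n = len(adj)
    bc = [0.0] * n
    for s in range(n):
        dist = [-1] * n            # BFS distance from s (-1 = not reached)
        sigma = [0] * n            # number of shortest paths from s
        preds = [[] for _ in range(n)]
        dist[s], sigma[s] = 0, 1
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = [0.0] * n
        for w in reversed(order):  # farthest nodes first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return [b / 2.0 for b in bc]   # every pair was counted from both ends


def closeness(adj):
    """Assumption: 1 / (sum of shortest-path lengths to reachable nodes)."""
    n = len(adj)
    result = []
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(d for d in dist if d > 0)
        result.append(1.0 / total if total else 0.0)
    return result


# Toy graphs: a path 0-1-2 and a star centered on node 0.
path = [[1], [0, 2], [1]]
star = [[1, 2, 3], [0], [0], [0]]
print(betweenness(path), closeness(path))
print(betweenness(star))
```

In the path, only the middle node lies between the other two; in the star, the center lies on the shortest path of each of the three pairs of leaves.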



One thing to be noticed is that the frequency distributions of the centrality measures of the asset graph are 
more distant from what would be expected from a scale-free network. This may be due to the great amount 
of noise that comes with the choice of threshold. For higher threshold values, this difference increases, and for 
lower threshold values, it decreases. 

7 K-shell decomposition 

Another important centrality measure, which can be applied only to the asset graph (and not to the MST), is 
k-shell decomposition, which consists of classifying a vertex according to the connections it makes, while also 
considering whether it lies in a region of the network that is itself highly connected. This decomposition is 
frequently used in the study of the propagation of diseases and of information, and it also sheds some light on 
financial networks. It consists of considering all nodes with degree 1, assigning to them k = 1, and stripping them 
from the network. Then, one looks at the remaining vertices with degree 2 or less, and assigns to them k = 2. 
Repeating the procedure, all vertices with k = 2 are removed, and one then looks for vertices with degree 3 or 
less. The process goes on until all nodes are removed. What one then obtains are shells of vertices that increase 
in importance as k grows. For our asset graph network with T = 0.7, there are 30 shells, and the stocks that 
belong to the innermost one (k = 30) are represented in figure 31. 
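The peeling procedure described above can be sketched in Python as follows. The sketch assumes the usual iterative pruning, in which a node whose degree falls to k or less after the removal of its neighbors is also placed in shell k; isolated nodes, if any, are given k = 0. The toy graph is made up for illustration.

```python
def k_shell(adj):
    """Shell index of every node; adj[i] lists the neighbors of node i."""
    nbrs = [set(a) for a in adj]       # working copy, edited while peeling
    alive = set(range(len(adj)))
    shell = [0] * len(adj)
    k = 0
    while alive:
        # Nodes whose current degree is at most k belong to shell k.
        low = [v for v in alive if len(nbrs[v]) <= k]
        if not low:
            k += 1                     # nothing left to peel at this level
            continue
        for v in low:
            shell[v] = k
            alive.discard(v)
            for u in nbrs[v]:
                nbrs[u].discard(v)     # removing v lowers its neighbors' degrees
            nbrs[v].clear()
    return shell


# Toy graph: a triangle (0, 1, 2) with a chain 0-3-4 hanging from it.
toy = [[1, 2, 3], [0, 2], [0, 1], [0, 4], [3]]
shells = k_shell(toy)
print(shells)
```

Here the chain is peeled off first (k = 1), leaving the triangle as the core (k = 2), even though node 3 starts with degree 2.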
Figure 31: core (k = 30) of the asset graph. 

There is a strong dependence between k-shell number and node degree, as can be seen in figure 32. With the 
exception of the innermost shell, there is an almost linear relation between the two measures. 

Figure 32: node degree as a function of k-shell number. 



8 Conclusion 

Using the concepts of minimum spanning trees and asset graphs, both based on the correlation matrix of 
log-returns of the BM&F-Bovespa for the year 2010, maps of that stock exchange were built, and the network 
structure so obtained was examined. One could see that the BM&F-Bovespa has a clustered structure 
roughly based on sectors of economic activity, and gravitates around some key stocks. The results may be used 
to obtain a better knowledge of the structure of this particular stock exchange, and perhaps to devise 
diversification strategies for portfolios of stocks belonging to it. There was also the opportunity to compare 
two different representations of the same complex system, each one with its benefits and drawbacks. 

Acknowledgements 

I am grateful for the support of this work by a grant from Insper, Instituto de Ensino e Pesquisa. I am also 
grateful to Leonidas Sandoval Neto (my father and an economist), who helped me clarify some details, and to 
Gustavo Curi Amarante, who collected the data. This article was written using LaTeX, all figures were made 
using PSTricks, and the calculations were made using Matlab, Ucinet and Excel. All data are freely available 
upon request at leonidassj@insper.edu.br. 

A Stocks, codes, and sectors 

Here I display, in alphabetical order, the stocks used in the present work, together with the 
companies they represent and the sectors to which they belong. These are not all the stocks traded 
on the BM&F-Bovespa: only those that were traded on every day the stock market was open are 
considered. All the stocks therefore have high liquidity, and there is no missing data in the time series being used. 



| Code | Company | Sector |
|---|---|---|
| ABCB4 | Banco ABC | Bancário |
| AEDU3 | Anhanguera Educacional | Educação |
| ALLL3 | América Latina Logística | Logística e transporte - trens |
| ALPA4 | Alpargatas | Tecidos, vestuário e calçados |
| AMAR3 | Lojas Marisa | Tecidos, vestuário e calçados |
| AMBV3 | AMBEV | Consumo - bebidas |
| AMBV4 | AMBEV | Consumo - bebidas |
| AMIL3 | Amil Participações | Saúde |
| BBAS3 | Banco do Brasil | Bancário |
| BBDC3 | Bradesco | Bancário |
| BBDC4 | Bradesco | Bancário |
| BBRK3 | Brasil Brokers | Construção, Materiais e Engenharia |
| BEEF3 | Minerva | Alimentos |
| BEMA3 | Bematech | Tecnologia, internet e call-centers |
| BICB4 | Bicbanco | Bancário |
| BISA3 | Brookfield Incorporações | Construção, Materiais e Engenharia |
| BVMF3 | BM&F-BOVESPA | Financeiro |
| BPNM4 | Banco Panamericano | Bancário |
| BRAP4 | Bradespar | Investimentos |
| BRFS3 | Brasil Foods | Alimentos |
| BRKM5 | Braskem | Petroquímica |
| BRML3 | BR Malls | Imóveis |
| BRSR6 | BANRISUL | Bancário |
| BRTO3 | Brasil Telecom | Telefonia fixa |
| BRTO4 | Brasil Telecom | Telefonia fixa |
| BTOW3 | B2W - Companhia Global do Varejo | Consumo - eletrônicos |
| BTTL4 | Battistella Adm | Investimentos |
| BVMF3 | BM&F-BOVESPA | Financeiro |
| CARD3 | CSU Cardsystem | Serviços diversos |
| CCIM3 | Camargo Corrêa | Construção, Materiais e Engenharia |
| CCRO3 | CCR | Logística e transporte |
| CESP6 | CESP | Energia elétrica |
| CIEL3 | Cielo | Financeiro |
| CLSC6 | Celesc | Energia elétrica |
| CMIG3 | CEMIG | Energia elétrica |
| CMIG4 | CEMIG | Energia elétrica |
| CNFB4 | Confab | Siderurgia |
| COCE5 | Coelce | Energia elétrica |
| CPFE3 | CPFL | Energia elétrica |
| CPLE6 | COPEL | Energia elétrica |
| CRDE3 | CR2 Empreendimentos Imobiliários | Construção, Materiais e Engenharia |
| CREM3 | Cremer | Saúde |
| CRUZ3 | Souza Cruz | Tobacco |
| CSAN3 | COSAN | Açúcar e álcool |
| CSMG3 | COPASA | Saneamento |
| CSNA3 | Cia Siderúrgica Nacional | Siderurgia |
| CTAX4 | Contax | Tecnologia, internet e call-centers |
| CTIP3 | CETIP | Financeiro |
| CYRE3 | Cyrela Brazil | Construção, Materiais e Engenharia |
| CZLT11 | COSAN | Alimentos |
| DASA3 | DASA - Diagnósticos da América | Saúde |
| DROG3 | Drogasil | Consumo - farmacêutico |
| DTEX3 | Duratex | Construção, Materiais e Engenharia |
| ECOD3 | Brasil Ecodiesel | Petróleo, gás e biocombustíveis |
| ELET3 | Eletrobras | Energia elétrica |
| ELET6 | Eletrobras | Energia elétrica |
| ELPL4 | Eletropaulo | Energia elétrica |
| EMBR3 | Embraer | Indústria de aviação |
| ENBR3 | EDP - Energias do Brasil | Energia elétrica |
| EQTL3 | Equatorial | Energia elétrica |
| ESTR4 | Estrela | Consumo - brinquedos |
| ETER3 | Eternit | Construção, Materiais e Engenharia |
| EUCA4 | Eucatex | Madeira e papel |
| EVEN3 | Even Construtora e Incorporadora | Construção, Materiais e Engenharia |
| EZTC3 | EZTEC Empreendimentos e Participações | Construção, Materiais e Engenharia |
| FESA4 | FERBASA | Siderurgia |
| FFTL4 | Valefert | Agronegócios |
| FHER3 | Fertilizantes Heringer | Agronegócios |
| FIBR3 | Fibria Celulose | Papel e celulose |
| FJTA4 | Forjas Taurus | Máquinas e equipamentos |
| FLRY3 | Fleury | Saúde |
| FRAS4 | Fras-Le | Material de transporte |
| GETI3 | AES Tietê | Energia elétrica |
| GETI4 | AES Tietê | Energia elétrica |
| GFSA3 | Gafisa | Construção, Materiais e Engenharia |
| GGBR3 | Gerdau | Siderurgia |
| GGBR4 | Gerdau | Siderurgia |
| GOAU4 | Metalúrgica Gerdau | Siderurgia - metalurgia |
| GOLL4 | GOL | Logística e transporte - aviação |
| GPCP3 | GPC Participações | Petroquímica |
| GPIV11 | GP Investments | Investimentos |
| GRND3 | Grendene | Tecidos, vestuário e calçados |
| GSHP3 | General Shopping Brasil | Imóveis |
| HAGA4 | Haga | Materiais de construção |
| HBOR3 | Helbor | Construção, Materiais e Engenharia |
| HGTX3 | Hering | Tecidos, vestuário e calçados |
| HYPE3 | Hypermarcas | Consumo |
| IDNT3 | Ideiasnet | Tecnologia, internet e call-centers |
| IENG5 | INEPAR | Construção, Materiais e Engenharia |
| IGBR3 | IGB Eletrônica | Eletrônica |
| IGTA3 | Iguatemi | Imóveis |
| INEP3 | Inepar Indústria e Construções | Máquinas e equipamentos |
| INEP4 | Inepar Indústria e Construções | Máquinas e equipamentos |
| INET3 | INEPAR Telecomunicações | Telecomunicações |
| INPR3 | Inpar | Construção, Materiais e Engenharia |
| ITSA4 | Itaúsa | Bancário |
| ITUB3 | Itaú Unibanco | Bancário |
| ITUB4 | Itaú Unibanco | Bancário |
| JBDU4 | JB Duarte | Investimentos |
| JBSS3 | JBS | Alimentos |
| JFEN3 | João Fortes Engenharia | Construção, Materiais e Engenharia |
| JHSF3 | JHSF Participações | Construção, Materiais e Engenharia |
| KEPL3 | Kepler Weber | Máquinas e equipamentos |
| KLBN4 | Klabin | Papel e celulose |
| KROT11 | Kroton Educacional | Educação |
| LAME3 | Lojas Americanas | Consumo - comércio |
| LAME4 | Lojas Americanas | Consumo - comércio |
| LIGT3 | Light | Energia elétrica |
| LLXL3 | LLX Logística | Logística e transporte |
| LOGN3 | Log-in Logística Intermodal | Logística e transporte |
| LPSB3 | Lopes Brasil | Construção, Materiais e Engenharia |
| LREN3 | Lojas Renner | Consumo - vestuário |
| LUPA3 | Lupatech | Máquinas e equipamentos |
| MAGG3 | Magnesita Refratários | Siderurgia |
| MILK11 | LAEP Investments | Alimentos |
| MLFT4 | Jereissati Participações | Telefonia fixa |
| MMXM3 | MMX Mineração e Metálicos | Mineração |
| MNDL4 | Mundial | Consumo - eletrodomésticos |
| MNPR3 | Minupar | Alimentos |
| MPXE3 | MPX Energia | Energia elétrica |
| MRFG3 | MARFRIG | Alimentos |
| MRVE3 | MRV Engenharia e Participações | Construção, Materiais e Engenharia |
| MTIG4 | Metalgráfica Iguaçu | Consumo - embalagens |
| MULT3 | Multiplan | Imóveis |
| MYPK3 | IOCHPE Maxion | Material de transporte |
| NATU3 | Natura | Consumo - cosméticos |
| NETC4 | NET | Telecomunicações |
| ODPV3 | Odontoprev | Saúde |
| OGXP3 | OGX Petróleo e Gás | Petróleo, gás e biocombustíveis |
| OHLB3 | Obrascon Huarte Lain Brasil | Logística e transporte |
| PCAR5 | Pão de Açúcar | Consumo - comércio |
| PDGR3 | PDG Realty | Construção, Materiais e Engenharia |
| PETR3 | Petrobras | Petróleo, gás e biocombustíveis |
| PETR4 | Petrobras | Petróleo, gás e biocombustíveis |
| PFRM3 | Profarma | Consumo - farmacêutico |
| PINE4 | PINE | Bancário |
| PLAS3 | Plascar | Material de transporte |
| PMAM3 | Paranapanema | Material de transporte |
| POMO4 | Marcopolo | Material de transporte |
| POSI3 | Positivo Informática | Tecnologia, internet e call-centers |
| PRVI3 | Providência | Materiais diversos |
| PSSA3 | Porto Seguro | Seguradora |
| RAPT4 | Randon | Material de transporte |
| RDCD3 | Redecard | Financeiro |
| RENT3 | Localiza Rent a Car | Logística e transporte - aluguel |
| RNAR3 | Renar Maçãs | Alimentos |
| RPMG3 | Refinaria de Petróleos Manguinhos | Petróleo, gás e biocombustíveis |
| RPMG4 | Refinaria de Petróleos Manguinhos | Petróleo, gás e biocombustíveis |
| RSID3 | Rossi Residencial | Construção, Materiais e Engenharia |
| SANB11 | Banco Santander | Bancário |
| SANB3 | Santander | Bancário |
| SANB4 | Santander | Bancário |
| SBSP3 | Sabesp | Saneamento |
| SFSA4 | Sofisa | Bancário |
| SGPS3 | Spring Global | Tecidos, vestuário e calçados |
| SLCE3 | SLC Agrícola | Agronegócios |
| SLED4 | Saraiva | Consumo - livros |
| SMTO3 | São Martinho | Açúcar e álcool |
| SULA11 | Sul América | Seguradora |
| SUZB5 | Suzano Papel e Celulose | Papel e celulose |
| TAMM4 | TAM | Logística e transporte - aviação |
| TBLE3 | Tractebel | Energia elétrica |
| TCNO3 | Tecnosolo Engenharia | Construção, Materiais e Engenharia |
| TCSA3 | Tecnisa | Construção, Materiais e Engenharia |
| TCSL3 | TIM | Telefonia móvel |
| TCSL4 | TIM | Telefonia móvel |
| TEKA4 | TEKA | Tecelagem |
| TELB3 | Telebras | Telefonia fixa |
| TELB4 | Telebras | Telefonia fixa |
| TEMP3 | Tempo | Saúde |
| TGMA3 | Tegma Gestão Logística | Logística e transporte |
| TLPP3 | Telesp | Telefonia fixa |
| TLPP4 | Telesp | Telefonia fixa |
| TMAR5 | Telemar Norte Leste | Telefonia fixa |
| TNLP3 | TELEMAR | Telefonia fixa |
| TNLP4 | TELEMAR | Telefonia fixa |
| TOTS3 | Totvs | Tecnologia, internet e call-centers |
| TOYB3 | TEC TOY | Consumo - brinquedos |
| TOYB4 | TEC TOY | Consumo - brinquedos |
| TPIS3 | Triunfo | Logística e transporte |
| TRPL4 | CTEEP | Energia elétrica |
| UGPA4 | Ultrapar | Investimentos |
| UNIP6 | Unipar | Petroquímica |
| UOLL4 | UOL | Tecnologia, internet e call-centers |
| USIM3 | Usiminas | Siderurgia |
| USIM5 | Usiminas | Siderurgia |
| VALE3 | Vale | Mineração |
| VALE5 | Vale | Mineração |
| VIVO4 | VIVO | Telefonia móvel |
| VLID3 | Valid | Serviços diversos |
| WEGE3 | WEG | Máquinas e equipamentos |

Table 3: stocks, companies, and sectors to which they belong. 

B Sector distribution 

In the following figures, I present the distribution of stocks according to sectors in three-dimensional maps. 
One may see that stocks from the same sector tend to agglomerate. 

Figure 33: banking (red), financial (blue), and holdings (black) sectors. 

Figure 34: mining (red), ironworks (blue), and petroleum (black) sectors. 

Figure 35: electricity (red), sanitation (blue), and education (black) sectors. 

Figure 36: telecommunications (red), cable TV (blue), and technology, internet, and call-centers (black) sectors. 

Figure 37: food (red), sugar and ethanol (blue), and agribusiness (black) sectors. 

Figure 38: construction, materials, and engineering (red), real estate (blue), and insurance (black). 

Figure 39: consumer goods (red), electronics (blue), and wood and paper (black). 

Figure 40: logistics and transportation (red), materials for transport and general (blue), airplane industry (black). 

Figure 41: equipment and machinery (red), general services for payment (blue). 